170 research outputs found
Privacy risk assessment of emerging machine learning paradigms
Machine learning (ML) has progressed tremendously, and data is the key factor to drive such development. However, there are two main challenges regarding collecting the data and handling it with ML models. First, the acquisition of high-quality labeled data can be difficult and expensive due to the need for extensive human annotation. Second, to model the complex relationships between entities, e.g., social networks or molecule structures, graphs have been leveraged. However, conventional ML models may not effectively handle graph data due to the non-linear and complex nature of the relationships between nodes. To address these challenges, recent developments in semi-supervised learning and self-supervised learning have been introduced to leverage unlabeled data for ML tasks. In addition, a new family of ML models known as graph neural networks has been proposed to tackle the challenges associated with graph data. Despite their power, the potential privacy risks stemming from these paradigms should also be taken into account. In this dissertation, we perform a privacy risk assessment of these emerging machine learning paradigms. First, we investigate the membership privacy leakage stemming from semi-supervised learning. Concretely, we propose the first data augmentation-based membership inference attack that is tailored to the training paradigm of semi-supervised learning methods. Second, we quantify the privacy leakage of self-supervised learning through the lens of membership inference attacks and attribute inference attacks. Third, we study the privacy implications of training GNNs on graphs. In particular, we propose the first attack to steal a graph from the outputs of a GNN model that is trained on the graph. Finally, we explore potential defense mechanisms to mitigate these attacks.
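The augmentation-based membership inference idea can be illustrated with a minimal, hypothetical sketch: a training-set member's loss tends to stay low and stable across augmented views of the same sample, so averaging per-augmentation losses and thresholding gives a toy decision rule. The losses, threshold, and decision rule below are illustrative assumptions, not the dissertation's actual attack:

```python
import statistics

def mia_score(aug_losses):
    # Hypothetical per-augmentation losses: members of the training set
    # tend to show low, stable loss across augmented views of a sample,
    # while non-members show higher, more variable loss.
    return statistics.mean(aug_losses)

def predict_member(aug_losses, threshold=0.5):
    # Toy decision rule: flag as a member when the average loss over
    # augmentations falls below a calibrated threshold.
    return mia_score(aug_losses) < threshold
```

A member-like loss profile such as [0.10, 0.12, 0.09] would be flagged, whereas [1.3, 0.9, 1.1] would not; real attacks calibrate the threshold, e.g., with shadow models.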
Stealing Links from Graph Neural Networks
Graph data, such as chemical networks and social networks, may be deemed
confidential/private because the data owner often spends lots of resources
collecting the data or the data contains sensitive information, e.g., social
relationships. Recently, neural networks have been extended to graph data; such
models are known as graph neural networks (GNNs). Due to their superior performance, GNNs
have many applications, such as healthcare analytics, recommender systems, and
fraud detection. In this work, we propose the first attacks to steal a graph
from the outputs of a GNN model that is trained on the graph. Specifically,
given black-box access to a GNN model, our attacks can infer whether there
exists a link between any pair of nodes in the graph used to train the model.
We call our attacks link stealing attacks. We propose a threat model to
systematically characterize an adversary's background knowledge along three
dimensions, which in total leads to a comprehensive taxonomy of 8 different link
stealing attacks. We propose multiple novel methods to realize these 8 attacks.
Extensive experiments on 8 real-world datasets show that our attacks are
effective at stealing links, e.g., AUC (area under the ROC curve) is above 0.95
in multiple cases. Our results indicate that the outputs of a GNN model reveal
rich information about the structure of the graph used to train the model.
Comment: To appear in the 30th Usenix Security Symposium, August 2021, Vancouver, B.C., Canada.
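The intuition behind these attacks can be sketched in a few lines: nodes that are connected in the training graph tend to receive more similar posteriors from the target GNN, so a simple unsupervised attack thresholds a similarity metric over posterior pairs. The metric and threshold below are illustrative choices, not the paper's full method:

```python
import math

def cosine_similarity(p, q):
    # Similarity between two posterior (class-probability) vectors.
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

def predict_link(posterior_u, posterior_v, threshold=0.9):
    # Unsupervised link-stealing intuition: connected nodes tend to
    # receive more similar posteriors from the target GNN.
    return cosine_similarity(posterior_u, posterior_v) >= threshold
```

Two nodes with near-identical posteriors, e.g., [0.9, 0.05, 0.05] and [0.85, 0.1, 0.05], would be predicted as linked; nodes predicted into different classes would not.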
Test-Time Poisoning Attacks Against Test-Time Adaptation Models
Deploying machine learning (ML) models in the wild is challenging as it
suffers from distribution shifts, where the model trained on an original domain
cannot generalize well to unforeseen diverse transfer domains. To address this
challenge, several test-time adaptation (TTA) methods have been proposed to
improve the generalization ability of the target pre-trained models under test
data to cope with the shifted distribution. The success of TTA can be credited
to the continuous fine-tuning of the target model according to the
distributional hint from the test samples during test time. Despite being
powerful, it also opens a new attack surface, i.e., test-time poisoning
attacks, which are substantially different from previous poisoning attacks that
occur during the training time of ML models (i.e., adversaries cannot intervene
in the training process). In this paper, we perform the first test-time
poisoning attack against four mainstream TTA methods, including TTT, DUA, TENT,
and RPL. Concretely, we generate poisoned samples based on the surrogate models
and feed them to the target TTA models. Experimental results show that the TTA
methods are generally vulnerable to test-time poisoning attacks. For instance,
the adversary can feed as few as 10 poisoned samples to degrade the performance
of the target model from 76.20% to 41.83%. Our results demonstrate that TTA
algorithms lacking a rigorous security assessment are unsuitable for deployment
in real-life scenarios. As such, we advocate for the integration of defenses
against test-time poisoning attacks into the design of TTA methods.
Comment: To appear in the 45th IEEE Symposium on Security and Privacy, May 20-23, 2024.
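To see why test-time updates create an attack surface, consider the objective that methods like TENT minimize: the entropy of the model's predictions on incoming test samples. A minimal sketch of that quantity (the adaptation loop and poisoning procedure themselves are omitted):

```python
import math

def prediction_entropy(probs):
    # TENT-style test-time adaptation minimizes the entropy of the
    # model's predictions on incoming test samples (updating, e.g.,
    # normalization parameters). Low entropy = confident prediction.
    return -sum(p * math.log(p) for p in probs if p > 0)
```

A confident prediction like [1.0, 0.0] has entropy 0, while a uniform one attains the maximum log(K). Because this objective is computed on attacker-controllable test samples, crafted inputs can steer the resulting parameter updates, which is the surface the paper exploits.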
MGTBench: Benchmarking Machine-Generated Text Detection
Nowadays large language models (LLMs) have shown revolutionary power in a
variety of natural language processing (NLP) tasks such as text classification,
sentiment analysis, language translation, and question-answering. Consequently,
detecting machine-generated texts (MGTs) is becoming increasingly important as
LLMs become more advanced and prevalent. These models can generate human-like
language that can be difficult to distinguish from text written by a human,
which raises concerns about authenticity, accountability, and potential bias.
However, existing detection methods against MGTs are evaluated under different
model architectures, datasets, and experimental settings, resulting in a lack
of a comprehensive evaluation framework across different methodologies.
In this paper, we fill this gap by proposing the first benchmark framework
for MGT detection, named MGTBench. Extensive evaluations on public datasets
with curated answers generated by ChatGPT (the most representative and powerful
LLM thus far) show that most of the current detection methods perform less
satisfactorily against MGTs. An exceptional case is ChatGPT Detector, which is
trained with ChatGPT-generated texts and shows great performance in detecting
MGTs. Nonetheless, we note that even a small amount of adversarially crafted
perturbation on MGTs can evade the ChatGPT Detector, thus highlighting the
need for more robust MGT detection methods. We envision that MGTBench will
serve as a benchmark tool to accelerate future investigations involving the
evaluation of state-of-the-art MGT detection methods on their respective
datasets and the development of more advanced MGT detection methods. Our source
code and datasets are available at https://github.com/xinleihe/MGTBench
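As a contrast to the model-based detectors that MGTBench evaluates (e.g., log-likelihood or rank statistics), here is a deliberately naive diversity heuristic; it is purely illustrative and is not one of the benchmarked methods:

```python
def distinct_ngram_ratio(text, n=3):
    # Deliberately naive diversity heuristic: heavily repetitive text
    # yields a low ratio of distinct word n-grams. This is NOT one of
    # the detectors benchmarked in MGTBench, which rely on model
    # statistics such as log-likelihood, rank, or entropy.
    tokens = text.lower().split()
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)
```

A looping, repetitive passage scores far lower than varied prose; real detectors replace this surface statistic with probabilities from a reference language model.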
Generative Watermarking Against Unauthorized Subject-Driven Image Synthesis
Large text-to-image models have shown remarkable performance in synthesizing
high-quality images. In particular, the subject-driven model makes it possible
to personalize the image synthesis for a specific subject, e.g., a human face
or an artistic style, by fine-tuning the generic text-to-image model with a few
images from that subject. Nevertheless, misuse of subject-driven image
synthesis may violate the rights of subject owners. For example, malicious
users may use subject-driven synthesis to mimic specific artistic styles or to
create fake facial images without authorization. To protect subject owners
against such misuse, recent attempts have commonly relied on adversarial
examples to indiscriminately disrupt subject-driven image synthesis. However,
this essentially prevents any benign use of subject-driven synthesis based on
protected images.
In this paper, we take a different angle and aim at protection without
sacrificing the utility of protected images for general synthesis purposes.
Specifically, we propose GenWatermark, a novel watermark system based on
jointly learning a watermark generator and a detector. In particular, to help
the watermark survive the subject-driven synthesis, we incorporate the
synthesis process in learning GenWatermark by fine-tuning the detector with
synthesized images for a specific subject. This operation is shown to largely
improve the watermark detection accuracy and also ensure the uniqueness of the
watermark for each individual subject. Extensive experiments validate the
effectiveness of GenWatermark, especially in practical scenarios with unknown
models and text prompts (74% Acc.), as well as partial data watermarking (80%
Acc. for 1/4 watermarking). We also demonstrate the robustness of GenWatermark
to two potential countermeasures that substantially degrade the synthesis
quality.
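The joint generator/detector learning in GenWatermark is beyond a short snippet, but the underlying embed-and-correlate principle of signal watermarking can be sketched with a fixed pseudorandom pattern. The additive embedding and correlation detector below are simplifying assumptions, not the paper's learned components:

```python
import random

def make_watermark(length, subject_seed):
    # A per-subject pseudorandom +/-1 pattern (a stand-in for the
    # learned, subject-specific watermark in GenWatermark).
    rng = random.Random(subject_seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def embed(signal, watermark, strength=0.05):
    # Additive embedding; the real system uses a learned generator.
    return [s + strength * w for s, w in zip(signal, watermark)]

def detect(signal, watermark):
    # Correlation detector: a clearly positive score suggests the
    # subject's watermark is present in the signal.
    return sum(s * w for s, w in zip(signal, watermark)) / len(watermark)
```

Seeding the pattern per subject mirrors the paper's per-subject uniqueness; the hard part GenWatermark solves is making such a mark survive the fine-tuning and synthesis pipeline, which this toy correlation scheme does not address.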
Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models
State-of-the-art Text-to-Image models like Stable Diffusion and DALLE2
are revolutionizing how people generate visual content. At the same time,
society has serious concerns about how adversaries can exploit such models to
generate unsafe images. In this work, we focus on demystifying the generation
of unsafe images and hateful memes from Text-to-Image models. We first
construct a typology of unsafe images consisting of five categories (sexually
explicit, violent, disturbing, hateful, and political). Then, we assess the
proportion of unsafe images generated by four advanced Text-to-Image models
using four prompt datasets. We find that these models can generate a
substantial percentage of unsafe images; across four models and four prompt
datasets, 14.56% of all generated images are unsafe. When comparing the four
models, we find different risk levels, with Stable Diffusion being the most
prone to generating unsafe content (18.92% of all generated images are unsafe).
Given Stable Diffusion's tendency to generate more unsafe content, we evaluate
its potential to generate hateful meme variants if exploited by an adversary to
attack a specific individual or community. We employ three image editing
methods, DreamBooth, Textual Inversion, and SDEdit, which are supported by
Stable Diffusion. Our evaluation result shows that 24% of the generated images
using DreamBooth are hateful meme variants that present the features of the
original hateful meme and the target individual/community; these generated
images are comparable to hateful meme variants collected from the real world.
Overall, our results demonstrate that the danger of large-scale generation of
unsafe images is imminent. We discuss several mitigating measures, such as
curating training data, regulating prompts, and implementing safety filters,
and encourage better safeguard tools to be developed to prevent unsafe
generation.
Comment: To appear in the ACM Conference on Computer and Communications Security, November 26, 2023.
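Of the mitigations mentioned, prompt regulation is the simplest to sketch. A denylist filter like the following (with a tiny, hypothetical term list) illustrates the idea, though production systems rely on learned safety classifiers rather than keyword matching:

```python
# A tiny, hypothetical denylist; a deployed filter would use a much
# larger vocabulary or, more realistically, a learned safety classifier.
UNSAFE_TERMS = {"gore", "beheading", "nude"}

def prompt_is_safe(prompt):
    # Naive prompt regulation: reject prompts containing denylisted words.
    words = set(prompt.lower().split())
    return words.isdisjoint(UNSAFE_TERMS)
```

Keyword filters are trivially evaded by paraphrasing, which is one reason the paper also discusses curating training data and post-hoc safety filters on generated images.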
Deformation rule of bored pile & steel support for deep foundation pit in sandy pebble geology
Taking the whole excavation process of the support system at the Southwest Jiaotong University Station of Chengdu Metro Line 6 (a deep foundation pit supported by bored piles and steel struts) as the engineering background, this paper studies the deformation behavior of the bored-pile-and-steel-strut support system in sandy pebble ground. The deformation of the support system, the settlement of the ground surface outside the pit, and the heave of the soil at the bottom of the pit are analyzed. A key analysis of the convex corner of the foundation pit is conducted, and the rationality of the optimized support scheme is evaluated. The findings provide effective guidance and a reference for subsequent deep foundation pit construction.